de Moraes et al. BMC Pediatrics 2014, 14:117
http://www.biomedcentral.com/1471-2431/14/117






COMMENTARY Open Access 



Potential biases in the classification, analysis
and interpretations in cross-sectional study:
commentaries - surrounding the article 'Resting
heart rate: its correlations and potential for
screening metabolic dysfunctions in adolescents'

Augusto Cesar Ferreira de Moraes, Alex Jones Flores Cassenote, Luis A Moreno and Heraclito Barbosa Carvalho

Abstract

Background: Resting heart rate (RHR) reflects sympathetic nerve activity. A significant association between
RHR and all causes of cardiovascular mortality has been reported by some epidemiologic studies. Despite
suggestive evidence, RHR has not been formally explored as a prognostic factor and potential
therapeutic outcome and, therefore, is not generally accepted in adolescents.

Discussion: The core of the debate is the methodology used in "Resting heart rate: its correlations
and potential for screening metabolic dysfunctions in adolescents". The points are: the cutoff used to classify RHR;
the use of two different statistical models to analyze the same set of variables, one for continuous data and another for
categorical data; the interpretation of p-value < 0.05; the sampling process involving two random stages; the analysis of
design effect; and the parameters of screening tests.

Summary: In evaluating a screening test's potential to discriminate on a common variable (population with
outcome vs. population without outcome), the main indicators that must be taken into account
are: sensitivity, specificity, accuracy, positive predictive value and negative predictive value. The measures of
argumentation equality (CI) or difference (p-value) are important to validate these indicators but do not indicate
quality of screening.

Background

Recently, Fernandes et al. published an article aimed at
analyzing the potential effects of screening and resting 
heart rate (RHR) on cardiometabolic risk in adolescents 
[1] in this respected journal. We read the manuscript 
with great interest, since RHR reflects sympathetic nerve 
activity [2,3], and it is an easily accessible clinical 
measurement. A significant association between resting 
HR and all-causes of cardiovascular mortality has been
reported in some epidemiological studies [2,4-6]. 

After studying the article, we decided to take the 
opportunity to propose a healthy debate on the meth- 
odological aspects used by Fernandes et al. [1]. With this 
debate, we hope to contribute to the enrichment of the 
reader, especially with regard to statistical analysis and 
interpretation of results. 

The aim of this article is to present a critical appraisal of 
methodological aspects of the article "Resting heart rate: 
its correlations and potential for screening metabolic dys- 
functions in adolescents" presented by BMC Pediatrics. 

Discussion

First, with regard to the manuscript methodology, what 
drew our attention was the cutoff used to classify RHR.
We see that the authors used cutoffs developed by the 
group of the first author (Fernandes RA) [7]. These cut- 
off points were developed by percentile distribution of a 
sample composed only of children and adolescent males 
and the study published in this journal is composed only 
of adolescents of both sexes. This decision introduced 
classification bias into the study, though it was not rec- 
ognized as a study limitation: children are biologically 
different than adolescents because they have not gone 
through puberty, and there are important and significant 
differences between the sexes concerning the cardiovas- 
cular system [8]. 

Boys had higher pooled prevalence than girls [9,10]. 
There are possible explanations for differences between 
the sexes: 1) the boys had a higher accumulation of 
visceral fat and intra-abdominal fat than girls [11], and 
visceral fat has been associated with higher sympathetic 
activity [12,13]. This activation is a key mechanism
underlying the effect of intra-abdominal fat accumulation
on the development of hypertension [14]. For example,
increased sympathetic flow may increase sodium re- 
absorption and subsequent increased peripheral vascular 
resistance resulting in increased blood pressure [14]. Also, 
this increased sympathetic activation can be caused by 
increased testosterone concentrations in males. Testoster- 
one, acting as a mediator of the androgen receptor gene 
function [15], has been associated not only with increased 
visceral fat but also with greater vasomotor sympathetic 
tone and blood pressure in adolescent boys, compared to 
girls [16]. Therefore, we believe that the cutoffs used are 
not appropriate for the reasons above and highlight the need for
the scientific community to develop better diagnostic 
criteria and methodological quality appropriate for each 
sex and age of this important indicator of the cardiovascu- 
lar system. 

According to the title of the article, the authors' object- 
ive was to analyze the impact of RHR for screening meta- 
bolic dysfunctions and also to identify its significance in 
adolescents. For this, they used two different statistical
models in order to analyze the same set of variables, one 
for continuous data, and another for categorical data. We 
found this odd, since assumptions for statistical models 
are quite distinct (binary logistic regression model vs. lin- 
ear regression model). So we raise the following questions: 
"Were the linear models used because no association was 
found with categorical variables? Why were the two models 
used? Why analyze variables with continuous data and 
then analyze these variables with categorical data,
sequentially?" We raise these questions because, according
to the stated objectives, the authors wanted to determine the
correlation between RHR and metabolic dysfunctions and also
the potential screening power of RHR. What is not
clear is the use of logistic regression to meet those aims.
We recommend that the authors state
why they used these tests and provide a reference with
a definitive description for readers [17].

With regard to OR estimates using binary logistic
regression, the literature shows that the OR (estimated
with logistic regression) has limitations as a measure of
effect in cross-sectional studies: the OR overestimates the
prevalence ratio (PR) or relative risk (RR) as the
prevalence/incidence of the outcome increases. When
prevalence/incidence is between 5% and 10%, the OR
approximates the PR/RR well; beyond that, the estimate is
heavily distorted and serves more to show the direction of
the association (risk or protection) than its magnitude. This
topic was widely discussed in the nineties by experts
[18-20], who confirmed that the OR overestimates the
magnitude of the associations between exposures and outcomes,
particularly at high prevalence [21,22]. The mathematical
model for logistic regression was developed in the 1970s
and 1980s to analyze case-control studies, where it is not
possible to estimate prevalence and the OR is used as a
proxy for relative risk [23,24]; this is another important
methodological factor neglected by the authors.
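
To make the size of this distortion concrete, the short sketch below holds the prevalence ratio fixed at 2.0 and lets the outcome prevalence in the unexposed group grow. All prevalences are hypothetical values chosen for illustration; none of them comes from Fernandes et al. [1].

```python
def odds(p: float) -> float:
    """Convert a prevalence (probability) into odds."""
    return p / (1.0 - p)


def odds_ratio(p_exposed: float, p_unexposed: float) -> float:
    """Odds ratio comparing the exposed and unexposed groups."""
    return odds(p_exposed) / odds(p_unexposed)


def prevalence_ratio(p_exposed: float, p_unexposed: float) -> float:
    """Prevalence ratio comparing the exposed and unexposed groups."""
    return p_exposed / p_unexposed


# Hypothetical scenario: the exposed group always has twice the
# prevalence of the unexposed group (PR = 2.0).
for p_unexposed in (0.01, 0.05, 0.10, 0.20, 0.40):
    p_exposed = 2.0 * p_unexposed
    pr = prevalence_ratio(p_exposed, p_unexposed)
    or_ = odds_ratio(p_exposed, p_unexposed)
    print(f"prevalence in unexposed = {p_unexposed:.2f}  PR = {pr:.2f}  OR = {or_:.2f}")
```

Under these assumptions the OR stays close to 2 while the outcome is rare and reaches 6 when the unexposed prevalence is 40%, although the PR never changes.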

The authors say they used a sampling process involving
two random stages (schools in the first stage and
individual classes in the second stage), but give no further details
of this process, for example, whether the complex sample
has good accuracy. When using complex samples, the
design effect (deff) helps to estimate how accurate the
sample was [25-27]. When the sampling process is not
accurate, the analyses need to be adjusted for the complexity
of the sample, and the lack of this adjustment also affects the
associations [28]. Therefore, the effects of risk factors
estimated by the logistic models, even those without statistical
significance, may differ from what would be found after
adjusting for the primary sampling unit.
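
As an illustration of why the design effect matters, the sketch below uses the common approximation deff = 1 + (m - 1) * rho, where m is the average number of respondents per cluster and rho is the intraclass correlation. This approximation and every number in the example (sample size, cluster size, rho) are assumptions made for illustration; the original article does not report them.

```python
import math


def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Approximate design effect for cluster sampling: 1 + (m - 1) * rho."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n: int, deff: float) -> float:
    """Size of a simple random sample carrying the same information."""
    return n / deff


# Hypothetical values: 1000 adolescents sampled in classes of 25,
# with an intraclass correlation of 0.05.
n_total = 1000
deff = design_effect(mean_cluster_size=25, icc=0.05)
n_eff = effective_sample_size(n_total, deff)

# Standard errors (and CI widths) grow by the square root of deff.
se_inflation = math.sqrt(deff)

print(f"deff = {deff:.2f}")
print(f"effective sample size = {n_eff:.0f} of {n_total}")
print(f"standard errors inflated by a factor of {se_inflation:.2f}")
```

With these hypothetical inputs the 1000 sampled adolescents carry roughly the information of 455 independent observations, so confidence intervals computed without adjustment would be too narrow.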

We found the use of RHR to screen for alterations
in glucose and triglycerides interesting but, according
to the data presented, we believe that there is no
evidence for this. Accuracy (AUC) for high glucose was
0.611 (95% CI 0.534-0.688) and for high triglycerides,
0.618 (95% CI 0.531-0.705), both with p-values < 0.05 but
with low discrimination power; note that the lower
confidence bound in some cases is very close to 0.50
(random event). In other words, if we consider
random variations within the CI bounds of AUC,
determining the presence or absence of high glucose and
high triglycerides will be as precise as playing a game
of heads or tails. With regard to the accuracy of
results, Swets [29] suggested operational cut-off points:
the test can be non-informative/equal to chance
(0.5 ≤ AUC < 0.7); moderately accurate (0.7 ≤ AUC < 0.9);
highly accurate (0.9 ≤ AUC < 1.0); or a perfect
discriminatory test (AUC = 1.0).
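
The bands attributed to Swets [29] can be applied directly to the AUC values quoted above. The sketch below is only an illustration: the function name is hypothetical, the treatment of boundary values (each band includes its lower limit) is an assumption, and values below 0.5 are grouped with the lowest band.

```python
def swets_band(auc: float) -> str:
    """Map an AUC value to the accuracy bands quoted in the text."""
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie in [0, 1]")
    if auc == 1.0:
        return "perfect"
    if auc >= 0.9:
        return "highly accurate"
    if auc >= 0.7:
        return "moderately accurate"
    # Assumption: anything below 0.7, including values under 0.5,
    # falls in the lowest band.
    return "non-informative"


# AUC point estimates and 95% CI bounds as quoted in the text.
reported = {
    "high glucose": (0.611, 0.534, 0.688),
    "high triglycerides": (0.618, 0.531, 0.705),
}

for outcome, (auc, lower, upper) in reported.items():
    print(
        f"{outcome}: AUC {auc} is {swets_band(auc)}; "
        f"95% CI runs from {swets_band(lower)} to {swets_band(upper)}"
    )
```

Both point estimates fall in the lowest band; only the upper confidence bound for high triglycerides (0.705) reaches the moderately accurate band.
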

Nowadays a "p-value < 0.05" or significant association is
commonly employed to illustrate the importance of the latest
scientific findings. We emphasize, however, that statistical
significance is neither a necessary nor a sufficient
condition for proving a scientific result [30]. P-values are often
used to emphasize the certainty of data, but they are only
a passive read-out of a statistical test and do not take into
account, for example, how well an experiment was designed
[31]. Goodman [32], in his "The P Value Fallacy",
discusses the apparent inconsistency in much medical
research, whereby studies are designed according to a
Neyman-Pearson statistical approach (e.g. based on formal
decision making and long-run evaluation of the inferential
procedures), fixing statistical parameters such as significance
level and power, but are then analyzed from a Fisherian
point of view (e.g. computing p-values and making
inferences based on their value, in comparison to common
thresholds).

We must remember that screening is conceptually
defined as tests performed on apparently healthy people
to identify those at an increased risk of a disease or
disorder [33]. According to the literature, for screening
to be accurate, a good screening test must have high
sensitivity (few false-negative results) and high
specificity (few false-positive results) [34], and even very good
tests have poor positive predictive value when applied to
low-prevalence populations [35].
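
The dependence of positive predictive value on prevalence follows from Bayes' theorem and can be sketched in a few lines. The sensitivity, specificity and prevalences below are hypothetical and serve only to illustrate the point; they do not describe RHR or any test in the original article.

```python
def positive_predictive_value(
    sensitivity: float, specificity: float, prevalence: float
) -> float:
    """PPV from Bayes' theorem: P(disease | positive test)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical test with 95% sensitivity and 95% specificity,
# applied to populations with different outcome prevalences.
for prevalence in (0.01, 0.05, 0.20, 0.50):
    ppv = positive_predictive_value(0.95, 0.95, prevalence)
    print(f"prevalence = {prevalence:.2f}  PPV = {ppv:.3f}")
```

Under these assumptions, a test that is 95% sensitive and 95% specific yields a PPV of only about 0.16 when the outcome prevalence is 1%, so most positive results would be false positives.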

We would like to emphasize that Fernandes et al. [1] 
have provided an important scientific contribution with 
their study on RHR, and that criticism is an integral part 
of scientific progress. As the pediatrician John Locke said, 
"...every step the mind takes in its progress towards know- 
ledge makes some discovery, which is not only new, but 
the best too, for the time at least". 

Summary 

The main indicators that must be taken into account for 
evaluation of a screening test to measure the potential 
for discrimination for a common variable (population 
with outcome vs. no outcome population) are: sensitiv- 
ity, specificity, accuracy, positive predictive value and 
negative predictive value. The measures of argumentation
equality (CI) or difference (p-value) are important
to validate these indicators but do not indicate quality of 
screening. 
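
For readers who wish to compute these five indicators, the sketch below derives them from a 2x2 table of test result against true outcome status. The counts are invented for illustration and the function name is hypothetical.

```python
def screening_indicators(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute the five screening indicators from 2x2 table counts.

    tp: test positive, outcome present    fp: test positive, outcome absent
    fn: test negative, outcome present    tn: test negative, outcome absent
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
        "positive predictive value": tp / (tp + fp),
        "negative predictive value": tn / (tn + fn),
    }


# Hypothetical counts for 500 screened adolescents, 50 with the outcome.
indicators = screening_indicators(tp=30, fp=70, fn=20, tn=380)

for name, value in indicators.items():
    print(f"{name}: {value:.3f}")
```

In this invented example the accuracy is 0.82 and the negative predictive value 0.95, yet sensitivity is 0.60 and positive predictive value 0.30, which is why the indicators need to be read together rather than singly.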

We believe the statistical methodologies employed in 
support of science should consider the objectives of the 
paper, type of data available (with the least possible 
transformations) and statistical assumptions in order to 
answer scientific hypotheses. The interpretation of 
statistical data has to be made very carefully, otherwise 
science loses its footing and becomes a relentless pursuit 
of the "p-value < 0.05". 